@Umang01-hash
Member

@Umang01-hash Umang01-hash commented Aug 7, 2025

Pull Request Template

Description:

What is DBResolver?

  • Adds a DBResolver module to GoFr, which provides automatic read/write splitting for SQL databases.

  • Routes by HTTP method: write operations (POST, PUT, PATCH, DELETE) go to the primary; read operations (GET) go to the replicas.

  • Seamlessly wraps the existing SQL datasource: does not require any application code changes for existing queries.

  • Developers interact with c.SQL exactly as before; all routing and failover are fully transparent.
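
The method-based rule above can be sketched in a few lines. This is an illustration only, not the module's code: `targetForMethod` and the target names are hypothetical, and sending every non-GET method to the primary is an assumption, since the description only lists POST, PUT, PATCH and DELETE as writes.

```go
package main

import (
	"fmt"
	"net/http"
)

// Hypothetical target names used only in this sketch.
const (
	targetPrimary = "primary"
	targetReplica = "replica"
)

// targetForMethod mirrors the rule in the description: GET reads go to a
// replica. Routing every other method to the primary is an assumption made
// here so that unlisted methods never hit a replica.
func targetForMethod(method string) string {
	if method == http.MethodGet {
		return targetReplica
	}

	return targetPrimary
}

func main() {
	for _, m := range []string{http.MethodGet, http.MethodPost, http.MethodDelete} {
		fmt.Printf("%-6s -> %s\n", m, targetForMethod(m))
	}
}
```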

Motivation & Benefits

  • Scalable horizontal read performance with multiple replicas
  • Reduced load on primary database
  • Fault-tolerant: automatic fallback to primary if all replicas fail
  • Clean metrics and tracing support for operational visibility

Example Usage:

package main

import (
	"gofr.dev/pkg/gofr"
	"gofr.dev/pkg/gofr/datasource/dbresolver"
)

type Customer struct {
	ID   int    `db:"id"`
	Name string `db:"name"`
}

func main() {
	a := gofr.New()

	// Initialize the DB resolver
	err := dbresolver.InitDBResolver(a, &dbresolver.Config{
		Strategy:      dbresolver.StrategyRoundRobin,         // round-robin or random replica selection
		ReadFallback:  true,                                  // allow reads on primary if all replicas are down
		MaxFailures:   3,                                     // failures allowed before a replica is marked as down
		TimeoutSec:    30,                                    // seconds a replica stays marked as down before it is retried
		PrimaryRoutes: []string{"/admin", "/api/payments/*"}, // routes that should always go to primary

		Replicas: []dbresolver.ReplicaCredential{
			{
				Host:     "localhost:3307",
				User:     "replica_user1",
				Password: "pass1",
			},
			{
				Host:     "replica2.example.com:3308",
				User:     "replica_user2",
				Password: "pass2",
			},
			{
				Host:     "replica3.example.com:3309",
				User:     "replica_user3",
				Password: "pass3",
			},
		},
	})
	if err != nil {
		a.Logger().Errorf("failed to initialize db resolver: %v", err)
	}

	// Read endpoint - goes to replica
	a.GET("/customers", func(c *gofr.Context) (interface{}, error) {
		var customers []Customer

		// Select returns no error value, so there is nothing to propagate here
		c.SQL.Select(c, &customers, "SELECT id, name FROM customers")

		return customers, nil
	})

	// Write endpoint - goes to primary
	a.POST("/customers", func(c *gofr.Context) (interface{}, error) {
		var customer Customer

		// Reject the request if the body cannot be bound
		if err := c.Bind(&customer); err != nil {
			return nil, err
		}

		_, err := c.SQL.Exec("INSERT INTO customers (name) VALUES (?)", customer.Name)

		return customer, err
	})

	// Admin endpoint - forced to primary
	a.GET("/admin/customers", func(c *gofr.Context) (interface{}, error) {
		var customers []Customer

		// Select returns no error value, so there is nothing to propagate here
		c.SQL.Select(c, &customers, "SELECT id, name FROM customers")

		return customers, nil
	})

	a.Run()
}
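
How `PrimaryRoutes` patterns are matched is not spelled out in this description. The sketch below shows one plausible reading that is consistent with the example, where `/admin` forces `/admin/customers` to the primary. The function `matchesPrimaryRoute` and its semantics are assumptions for illustration, not the module's confirmed behavior: a plain pattern covers the path itself and its sub-paths, and a pattern ending in `/*` covers only sub-paths.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesPrimaryRoute reports whether path is covered by any pattern.
// Assumed semantics (not confirmed by the PR):
//   - "/api/payments/*" matches any path below "/api/payments/"
//   - "/admin" matches "/admin" itself and any path below "/admin/"
func matchesPrimaryRoute(path string, patterns []string) bool {
	for _, p := range patterns {
		if prefix, ok := strings.CutSuffix(p, "/*"); ok {
			if strings.HasPrefix(path, prefix+"/") {
				return true
			}

			continue
		}

		if path == p || strings.HasPrefix(path, p+"/") {
			return true
		}
	}

	return false
}

func main() {
	patterns := []string{"/admin", "/api/payments/*"}

	for _, path := range []string{"/admin/customers", "/api/payments/42", "/customers"} {
		fmt.Println(path, matchesPrimaryRoute(path, patterns))
	}
}
```

Matching on a `/` boundary rather than a raw string prefix keeps a route such as `/administrator` from being captured by the `/admin` pattern.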

Configuration Example:

DB_HOST=localhost
DB_USER=root
DB_PASSWORD=rootpassword
DB_NAME=testdb
DB_PORT=3306
DB_DIALECT=mysql
DB_MAX_IDLE_CONNECTION=2
DB_MAX_OPEN_CONNECTION=0

Testing Strategy:

  • Primary and Replicas launched with docker-compose (primary:3306, replicas:3307/3308)

  • Replication automated by setup scripts; seed data from SQL dump and app endpoints

  • Load Testing: Performed with Apache JMeter simulating high-concurrency API requests (both reads and writes)

Results:

  • Zero error rate; throughput and latency showed no regression compared to the baseline.

  • The read/write split added no measurable overhead under this load, and read scaling required no application changes.

Checklist:

  • I have formatted my code using goimports and golangci-lint.
  • All new code is covered by unit tests.
  • This PR does not decrease the overall code coverage.
  • I have reviewed the code comments and documentation for clarity.

Thank you for your contribution!

Umang01-hash and others added 17 commits July 28, 2025 16:36

@gizmo-rt
Contributor

@Umang01-hash Can you add more details on the sequence of R (Read) and W (Write) cases under which read and write replicas would be selected?

@Umang01-hash
Member Author

Hey @gizmo-rt, we first determine whether a query is a read or a write. For a read query, we check whether a healthy replica is available. If one is, we select it and send the read query to it; if not, we send the query to the primary database. Methods like QueryContext and Select follow this logic. Methods like Exec and Prepare use the primary database directly. Transaction methods (Begin, BeginTx) are always routed to the primary.

If all replicas are unhealthy at the time of a read query and fallback is enabled, the read falls back to the primary. If fallback is disabled, the read fails with an error.
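
A minimal sketch of this fallback rule, for illustration only. The names `readTarget` and `errNoHealthyReplica` are hypothetical, and taking the first healthy replica stands in for whatever the configured strategy would pick.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoHealthyReplica is a hypothetical error for this sketch.
var errNoHealthyReplica = errors.New("no healthy replica and read fallback is disabled")

// readTarget decides where a read query goes:
//  1. a healthy replica if there is one,
//  2. otherwise the primary when fallback is enabled,
//  3. otherwise an error.
func readTarget(healthyReplicas []string, readFallback bool) (string, error) {
	if len(healthyReplicas) > 0 {
		// A real strategy (round-robin or random) would choose here.
		return healthyReplicas[0], nil
	}

	if readFallback {
		return "primary", nil
	}

	return "", errNoHealthyReplica
}

func main() {
	fmt.Println(readTarget([]string{"replica-1"}, true))
	fmt.Println(readTarget(nil, true))
	fmt.Println(readTarget(nil, false))
}
```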

If a replica experiences multiple failures, its circuit breaker opens and the replica is temporarily skipped for queries. The timeout period is 30 seconds by default, and we allow 5 failures per replica before the circuit breaker opens.
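
The circuit breaker behavior can be sketched roughly as follows. This is a simplified, single-goroutine illustration and not the actual implementation: the `breaker` type, its methods, and the `at` helper are hypothetical, and opening the breaker when the failure count reaches the limit (rather than exceeds it) is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// breaker tracks failures for one replica. It is not safe for concurrent
// use; a real implementation would need atomics or a mutex.
type breaker struct {
	maxFailures int
	timeout     time.Duration
	failures    int
	open        bool
	openedAt    time.Time
}

func newBreaker(maxFailures, timeoutSec int) *breaker {
	return &breaker{maxFailures: maxFailures, timeout: time.Duration(timeoutSec) * time.Second}
}

// recordFailure counts a failed query and opens the breaker once the
// count reaches maxFailures (assumed threshold semantics).
func (b *breaker) recordFailure(now time.Time) {
	b.failures++
	if b.failures >= b.maxFailures {
		b.open = true
		b.openedAt = now
	}
}

// recordSuccess resets the failure count after a successful query.
func (b *breaker) recordSuccess() { b.failures = 0 }

// allow reports whether the replica may receive queries at time now.
// Once the timeout has elapsed, the breaker closes and the replica is retried.
func (b *breaker) allow(now time.Time) bool {
	if !b.open {
		return true
	}

	if now.Sub(b.openedAt) >= b.timeout {
		b.open = false
		b.failures = 0

		return true
	}

	return false
}

// at is a demo helper that builds a timestamp n seconds after the Unix epoch.
func at(n int) time.Time { return time.Unix(int64(n), 0) }

func main() {
	b := newBreaker(5, 30) // the defaults mentioned above

	for i := 0; i < 5; i++ {
		b.recordFailure(at(0))
	}

	fmt.Println(b.allow(at(10))) // false: still inside the 30-second window
	fmt.Println(b.allow(at(30))) // true: the window has elapsed
}
```

Passing the current time in as a parameter keeps the sketch deterministic; production code would more likely read the clock internally.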

For an example sequence like R,R,W,R:

1st R: Routed to replica

2nd R: Routed to the next replica (which one depends on the chosen strategy, random or round-robin)

1st W: Routed to primary

3rd R: Routed to next replica
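
The sequence above can be reproduced with a small round-robin simulation. This is a sketch under assumptions: three healthy replicas with hypothetical names, the round-robin strategy, and writes that do not advance the replica counter. The `roundRobin` type and `route` function are illustrative, not the module's API.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin cycles through replicas using an atomic counter so that
// concurrent requests each get a distinct position in the rotation.
// It assumes at least one replica is configured.
type roundRobin struct {
	replicas []string
	next     atomic.Uint64
}

// pick returns the next replica in rotation.
func (r *roundRobin) pick() string {
	idx := r.next.Add(1) - 1

	return r.replicas[idx%uint64(len(r.replicas))]
}

// route maps a sequence of operations ('R' for read, 'W' for write) to the
// target that would serve each one. Writes always go to the primary.
func route(ops string, rr *roundRobin) []string {
	targets := make([]string, 0, len(ops))

	for _, op := range ops {
		if op == 'W' {
			targets = append(targets, "primary")
			continue
		}

		targets = append(targets, rr.pick())
	}

	return targets
}

func main() {
	rr := &roundRobin{replicas: []string{"replica-1", "replica-2", "replica-3"}}

	fmt.Println(route("RRWR", rr)) // [replica-1 replica-2 primary replica-3]
}
```

With the random strategy the reads would still land on replicas, but the order would not be predictable.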

@coolwednesday
Member

coolwednesday commented Nov 21, 2025

As reported by @Umang01-hash, here is a metric for other people's reference. We are good to go with this PR.

Before:
[Screenshot 2025-11-21 at 3 12 57 PM]

Requests served in the last 10 min:
[Screenshot 2025-11-21 at 3 15 05 PM]

After:
[Screenshot 2025-11-21 at 3 12 42 PM]

Requests served in the last 10 min:
[Screenshot 2025-11-21 at 3 15 15 PM]

coolwednesday previously approved these changes Nov 21, 2025
@coolwednesday coolwednesday dismissed akshat-kumar-singhal’s stale review November 26, 2025 06:24

LGTM. Since the PR has been pending for a while, we’ll go ahead and merge it.

@coolwednesday coolwednesday merged commit 2aa8ece into development Nov 26, 2025
19 checks passed
@coolwednesday coolwednesday deleted the en/db_resolver branch November 26, 2025 10:55

8 participants